This study covers all 1008 current Pokémon across the nine generations of games. We also consider their existing variations and transformations.
We won't dive deep into the games' mechanics, but we present here the key terms for some of their classifications.
| Generation | Core Games | Region | New Pokémon | Platform |
|---|---|---|---|---|
| I | Red / Green / Blue / Yellow | Kanto | 151 | Game Boy |
| II | Gold / Silver / Crystal | Johto | 100 | Game Boy Color |
| III | Ruby / Sapphire / Emerald | Hoenn | 135 | Game Boy Advance |
| IV | Diamond / Pearl / Platinum | Sinnoh | 107 | Nintendo DS |
| V | Black / White / Black 2 / White 2 | Unova | 156 | Nintendo DS |
| VI | X / Y | Kalos | 72 | Nintendo 3DS |
| VII | Sun / Moon / Ultra Sun / Ultra Moon | Alola | 88 | Nintendo 3DS |
| VIII | Sword / Shield / Legends: Arceus | Galar / Hisui | 96 | Nintendo Switch |
| IX | Scarlet / Violet | Paldea | 103 | Nintendo Switch |
Each Pokémon can belong to 1 or 2 different types. Each battle move also belongs to a type.
Every type has its own strengths and weaknesses against other specific types.
There are a total of 18 types:
Some Pokémon belong to specific groups.
Most of these groups are related to the rarity and power of their Pokémon.
The last two presented below refer to battle mechanics, which can transform the base stats of the Pokémon.
Some Pokémon have regional forms, which change their appearance as well as their types, movesets and base stats.
Each Pokémon has 6 base stats, which are the base values for their permanent stats.
There are also two in-battle stats: Evasion and Accuracy, which are not considered here.
These base values are combined with several modifiers (level, nature, individual value (IV), effort value (EV) and awakening value) to calculate each Pokémon's final stats.
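As a rough illustration of how a base stat turns into a final stat, the sketch below uses the formula commonly documented for the core games from Generation III onward. Treating that formula as the relevant one is an assumption. It leaves out awakening values, which only exist in the Let's Go games, and the function names and the nature-as-percentage convention are made up for this example.

```python
def final_hp(base: int, iv: int, ev: int, level: int) -> int:
    """HP stat from the commonly documented Generation III+ formula."""
    return (2 * base + iv + ev // 4) * level // 100 + level + 10


def final_other(base: int, iv: int, ev: int, level: int, nature_pct: int = 100) -> int:
    """Any non-HP stat. nature_pct is 110, 100 or 90 (hypothetical convention)."""
    raw = (2 * base + iv + ev // 4) * level // 100 + 5
    # Integer arithmetic keeps the rounding-down step exact.
    return raw * nature_pct // 100


# Example with base HP 108 and base Attack 130 (Garchomp's base stats).
hp = final_hp(base=108, iv=24, ev=74, level=78)
attack = final_other(base=130, iv=12, ev=190, level=78, nature_pct=110)
print(hp, attack)  # 289 278
```

The same base value can therefore lead to quite different final stats, which is why the analysis here compares base stats only.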
The base stats are:
We start with some simple analysis of the Pokémon's six base stats.
Below we show a radar chart for each of the 18 types, giving the mean value of each stat.
radar_charts_type()
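The values behind such radar charts are per-type means. A minimal sketch of that aggregation follows, assuming that a dual-type Pokémon counts toward both of its types. The column names and the `mean_stats_by_type` helper are hypothetical, and the three rows are only a toy sample.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical column names; the original data frame may use different ones.
STATS = ("hp", "attack", "defense", "sp_attack", "sp_defense", "speed")

# Toy rows for illustration (base stats of three Generation I Pokémon).
pokemon = [
    {"name": "Bulbasaur", "types": ("grass", "poison"),
     "hp": 45, "attack": 49, "defense": 49, "sp_attack": 65, "sp_defense": 65, "speed": 45},
    {"name": "Oddish", "types": ("grass", "poison"),
     "hp": 45, "attack": 50, "defense": 55, "sp_attack": 75, "sp_defense": 65, "speed": 30},
    {"name": "Charmander", "types": ("fire",),
     "hp": 39, "attack": 52, "defense": 43, "sp_attack": 60, "sp_defense": 50, "speed": 65},
]


def mean_stats_by_type(rows):
    """Average each stat per type; a dual-type Pokémon counts toward both types."""
    grouped = defaultdict(list)
    for row in rows:
        for type_name in row["types"]:
            grouped[type_name].append(row)
    return {
        type_name: {stat: mean(r[stat] for r in members) for stat in STATS}
        for type_name, members in grouped.items()
    }


means = mean_stats_by_type(pokemon)
print(means["grass"])
```

Each inner dictionary holds the six values that one radar chart would plot.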
Analyzing each type chart, we can already see some interesting aspects:
Next we show the stat charts comparing common Pokémon with the strong groups: Legendary, Mythical, Ultra Beasts, and two battle mechanics, Mega Evolutions and Gigantamax.
radar_charts_group()
Moving on, we show below the Pokémon with the highest value of each stat, as well as the one with the highest total.
show_highest('hp')
show_highest('atk')
show_highest('spatk')
show_highest('def')
show_highest('spd')
show_highest('total')
Note that we considered only obtainable Pokémon (therefore excluding Eternamax Eternatus).
For the highest Defense, three Pokémon share the same value: Shuckle (shown), Mega Steelix and Mega Aggron. Shuckle was chosen here because it also has the highest Special Defense.
For total stats, Mega Rayquaza, Mega Mewtwo X and Mega Mewtwo Y have the same total, but Mega Rayquaza was chosen due to its better distribution of stats.
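The Defense selection described above (drop unobtainable Pokémon, then break ties on a second stat) can be sketched as follows. The `highest` helper, the `obtainable` flag and the column names are hypothetical, not the actual `show_highest` implementation. The rows contain only the Pokémon named in the text.

```python
# Toy rows: Defense and Special Defense of the Pokémon mentioned in the text.
candidates = [
    {"name": "Mega Steelix", "defense": 230, "sp_defense": 95, "obtainable": True},
    {"name": "Shuckle", "defense": 230, "sp_defense": 230, "obtainable": True},
    {"name": "Mega Aggron", "defense": 230, "sp_defense": 80, "obtainable": True},
    {"name": "Eternamax Eternatus", "defense": 250, "sp_defense": 250, "obtainable": False},
]


def highest(rows, stat, tie_breaker):
    """Return the obtainable row with the highest `stat`, ties broken by `tie_breaker`."""
    obtainable = [row for row in rows if row["obtainable"]]
    # Tuples compare element by element, so the second stat only matters on a tie.
    return max(obtainable, key=lambda row: (row[stat], row[tie_breaker]))


print(highest(candidates, "defense", "sp_defense")["name"])  # Shuckle
```

The tie on total stats was resolved by judging the stat distribution, which this sketch does not attempt to capture.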
For a better overall representation, we now apply a Principal Component Analysis (PCA) transformation to the base stats.
PCA finds a set of orthogonal directions (the principal components), ordered by how much of the data's variance each one captures. In other words, it reproduces the major part of the variation in the data in a small number of dimensions.
pca(df)
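For readers unfamiliar with the transformation, here is a minimal PCA sketch using NumPy's singular value decomposition. It is not the `pca` function called above: `pca_sketch` is a hypothetical name, standardizing the stats first is an assumption, and the input is random synthetic data standing in for the six base stats.

```python
import numpy as np


def pca_sketch(X, n_components=2):
    """Return (scores, loadings, explained_variance_ratio) for the leading components."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)        # standardize each column (assumption)
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # coordinates of each row in PC space
    loadings = Vt[:n_components].T                   # weight of each original column per PC
    ratio = S**2 / np.sum(S**2)                      # share of variance per component
    return scores, loadings, ratio[:n_components]


# Synthetic stand-in for a (Pokémon x 6 base stats) table.
rng = np.random.default_rng(0)
stats = rng.integers(20, 160, size=(50, 6))

scores, loadings, ratio = pca_sketch(stats)
print(scores.shape, loadings.shape)  # (50, 2) (6, 2)
print(ratio.sum())                   # fraction of total variance kept by two components
```

The sum of `ratio` is worth checking on real data, since it tells how much of the six-stat variation the two-dimensional plot actually preserves.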
With the first two principal components, we can approximate the variance of all six base stats in a two-dimensional space.
Let us analyze how the base stats are distributed over these two principal components.
pc(df)
We can identify some patterns.
Speed is, overall, oriented in the opposite direction to Defense, meaning that Pokémon with high Defense tend to be slow and vice versa.
We can also see that Defense is not strongly correlated with Special Defense. Attack, however, is more correlated with Special Attack, meaning it is more common for a Pokémon to have high Attack and high Special Attack together.
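One way to quantify "oriented in the opposite direction" is the cosine of the angle between two stats' loading arrows in the plane of the first two components: values near -1 mean opposite directions, values near 1 mean the same direction. The sketch below is only an illustration on synthetic data in which one column is deliberately built to fall as another rises. `loading_cosine` is a hypothetical helper, and the numbers are not Pokémon stats.

```python
import numpy as np


def loading_cosine(X, i, j):
    """Cosine of the angle between the loading arrows of columns i and j (first two PCs)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    arrows = Vt[:2].T                 # one 2-D arrow per original column
    a, b = arrows[i], arrows[j]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic columns: "speed" is constructed to decrease as "defense" increases.
rng = np.random.default_rng(1)
defense = rng.normal(80, 20, size=200)
speed = 160 - defense + rng.normal(0, 5, size=200)
other = rng.normal(70, 20, size=200)   # independent noise column
X = np.column_stack([defense, speed, other])

print(loading_cosine(X, 0, 1))  # strongly negative: the arrows point in opposite directions
```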
With this representation we can also identify the behavior of some groups.
pc_type(df)
Here we analyze four distinct types:
We can also visualize Pokémon groups in PC representation.
pc_group(df)
The figures above show the Legendary, Mythical, Pseudo Legendary and Mega Evolution groups in the PC representation.
All four groups are located on the right side, indicating that overall they have higher stats.
Our last analysis on the PC representation concerns generations. Generation 6 behaves differently from the others because it is the one that introduced Mega Evolutions; as a result, its points are grouped more toward the right side. The other generations have a sparser distribution, like the one shown on the right, which corresponds to Generation 8.
pc_gen(df)
Up until now we have only used the Pokémon's base stats for the analysis.
However, there is another main feature that is essential to battles: the moves a Pokémon can use.
There are 900 moves in total, each with a series of features such as type, category, power, accuracy and secondary effects. In fact, our table of moves has 900 rows and 87 columns.
Each Pokémon has a limited list of moves it can learn, and it can only know 4 of them at a time. We won't dive deeper into the details of the moves.
Representing each Pokémon by the stats of its possible moves gives us high-dimensional data (88181 rows by 110 columns).
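The row count suggests one row per (Pokémon, learnable move) pair, with the Pokémon's columns and the move's columns side by side. That reading is an assumption, and the sketch below only illustrates the idea on a toy sample. `pair_rows`, the `learnsets` mapping and all column names are hypothetical.

```python
# Toy tables: a few Pokémon features, a few move features, and who learns what.
pokemon = {
    "Bulbasaur": {"type1": "grass", "hp": 45, "speed": 45},
    "Charmander": {"type1": "fire", "hp": 39, "speed": 65},
}
moves = {
    "Tackle": {"move_type": "normal", "power": 40, "accuracy": 100},
    "Vine Whip": {"move_type": "grass", "power": 45, "accuracy": 100},
    "Scratch": {"move_type": "normal", "power": 40, "accuracy": 100},
    "Ember": {"move_type": "fire", "power": 40, "accuracy": 100},
}
learnsets = {
    "Bulbasaur": ["Tackle", "Vine Whip"],
    "Charmander": ["Scratch", "Ember"],
}


def pair_rows(pokemon, moves, learnsets):
    """Build one row per (Pokémon, learnable move) pair, merging both feature sets."""
    rows = []
    for name, move_names in learnsets.items():
        for move_name in move_names:
            row = {"pokemon": name, "move": move_name}
            row.update(pokemon[name])      # Pokémon columns
            row.update(moves[move_name])   # move columns
            rows.append(row)
    return rows


rows = pair_rows(pokemon, moves, learnsets)
print(len(rows), len(rows[0]))  # 4 8
```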
To represent this, we apply t-SNE (t-Distributed Stochastic Neighbor Embedding) dimensionality reduction, which produces a two-dimensional visualization of the entire data set.
t-SNE is well suited to visualizing high-dimensional data: it performs a non-linear transformation of the data that captures local similarities.
For this transformation we use the relevant information: each Pokémon's base stats, its types, and the moves it can learn.
tsne()
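A minimal sketch of such an embedding with scikit-learn's `TSNE` is shown below. It is not the `tsne` function called above. The data is random and synthetic, and the preprocessing (standardized numeric columns plus a one-hot encoded type) as well as the `perplexity` value are assumptions made for the example.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 60

# Synthetic stand-ins: six numeric stats and one categorical type with 3 levels.
numeric = rng.integers(20, 160, size=(n, 6)).astype(float)
type_ids = rng.integers(0, 3, size=n)
one_hot = np.eye(3)[type_ids]            # one-hot encode the categorical column

# Put numeric columns on a common scale before combining them with the one-hot block.
features = np.hstack([StandardScaler().fit_transform(numeric), one_hot])

# perplexity must be smaller than the number of samples; random_state fixes the layout.
embedding = TSNE(
    n_components=2, perplexity=10.0, init="pca", random_state=0
).fit_transform(features)

print(embedding.shape)  # (60, 2)
```

The layout depends on `perplexity` and on the random seed, so fixing `random_state` helps make figures reproducible.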
The t-SNE presented here captures information from all Pokémon and their respective moves.
This visualization shows that Pokémon with higher stats are spread around the space, closer to its borders, while points closer to the center correspond to lower stats.
Investigating the Pokémon groups confirms what was presented with PCA: Legendary Pokémon and Mega Evolutions are closer to the border because they are stronger Pokémon, with higher stats.
tsne_group()
With this representation we can analyze types as well.
Below we compare Pokémon types against move types for the three initial (starter) types: Grass, Fire and Water.
tsne_type('grass')
tsne_type('fire')
tsne_type('water')
The figures above show that Pokémon of the initial types (Grass, Fire and Water) are well spread around the space, indicating a varied range of behavior.
Fire-type moves, however, are less spread out, indicating that they behave more similarly to one another than the moves of the other two types.
Below we present the representation for the Normal, Psychic and Dragon types.
tsne_type('normal')
tsne_type('psychic')
tsne_type('dragon')
The Normal type is the most common, for both Pokémon and moves.
The right-hand graph has many more points, because Pokémon of other types also learn Normal-type moves.
The Psychic type is presented here because it showed a different behavior: both Pokémon and moves are closer to the border than for other types. This suggests that Psychic Pokémon and moves are slightly stronger than those of other types.
While there is a significant number of Dragon Pokémon, there are only a few Dragon-type moves, and they are much more tightly clustered, indicating similar behavior.